[Iluvatar GPU] Adapt VL model #4313
base: develop
Conversation
Thanks for your contribution!
Force-pushed from b5fe3b3 to d0a687a.
Pull Request Overview
This PR adapts VL (Vision-Language) models for Iluvatar GPU hardware, implementing platform-specific optimizations and configurations to support multimodal inference on Iluvatar devices.
- Pin paddleformers version to 0.3.0 for compatibility
- Implement Iluvatar-specific attention backend optimizations for VL models (see the platform-gating sketch after this list)
- Add support for text-image processing operations and memory management
- Provide comprehensive documentation with installation and usage examples
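Most of the per-file changes summarized below follow one platform-gating pattern: detect the Iluvatar backend at runtime and fall back to Iluvatar-safe code paths (for example, disabling the fused matmul bias in the resampler and DFNRope modules). The sketch below only illustrates that pattern; the `current_platform.is_iluvatar()` helper and the fused-linear call are assumptions based on how FastDeploy and Paddle expose such checks elsewhere, not code copied from this PR.

```python
# Illustrative sketch of the platform-gating pattern; the platform helper and
# fused op below are assumptions, not code taken verbatim from this PR.
import paddle

from fastdeploy.platforms import current_platform  # assumed helper module

# On Iluvatar the fused matmul+bias kernel is unavailable, so bias addition
# falls back to a separate elementwise add after the matmul.
FUSE_MATMUL_BIAS = not current_platform.is_iluvatar()


def linear_with_optional_fusion(x, weight, bias):
    if FUSE_MATMUL_BIAS:
        # Fused path on platforms that support it (Paddle's incubate fused linear).
        return paddle.incubate.nn.functional.fused_linear(x, weight, bias)
    # Unfused fallback used on Iluvatar.
    return paddle.matmul(x, weight) + bias
```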
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| requirements_iluvatar.txt | Pin paddleformers to version 0.3.0 for stability |
| fastdeploy/worker/iluvatar_worker.py | Configure Paddle flags for multimodal support |
| fastdeploy/worker/iluvatar_model_runner.py | Add VL model initialization and rope embedding handling |
| fastdeploy/worker/gpu_model_runner.py | Import additional Iluvatar-specific operations |
| fastdeploy/model_executor/models/ernie4_5_vl/modeling_resampler.py | Disable fused matmul bias for Iluvatar platform |
| fastdeploy/model_executor/models/ernie4_5_vl/image_op.py | Add Iluvatar platform support for image operations |
| fastdeploy/model_executor/models/ernie4_5_vl/dfnrope/modeling.py | Disable fused matmul bias for Iluvatar platform |
| fastdeploy/model_executor/layers/rotary_embedding.py | Add Iluvatar-specific rotary embedding handling |
| fastdeploy/model_executor/layers/attention/iluvatar_attn_backend.py | Implement VL-specific attention metadata and tensor handling |
| docs/zh/get_started/installation/iluvatar_gpu.md | Add Chinese documentation for VL model usage |
| docs/get_started/installation/iluvatar_gpu.md | Add English documentation for VL model usage |
| custom_ops/setup_ops.py | Register additional CUDA operations for Iluvatar support (see the sketch after this table) |
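For the custom_ops/setup_ops.py row above, registering the additional CUDA operations comes down to adding the new .cu sources to the list handed to Paddle's extension builder for the Iluvatar build. The sketch below only illustrates that wiring: the variable and extension names are made up, and only the .cu paths come from this PR's diff (shown again in the review snippet further down).

```python
# Sketch of registering the extra CUDA sources for the Iluvatar build.
# Variable/extension names are illustrative; only the .cu paths come from the PR.
from paddle.utils.cpp_extension import CUDAExtension, setup

iluvatar_sources = [
    "gpu_ops/sample_kernels/top_k_renorm_probs.cu",
    # Text-image ops required by the VL (vision-language) pipeline:
    "gpu_ops/text_image_index_out.cu",
    "gpu_ops/text_image_gather_scatter.cu",
    # "gpu_ops/extract_text_token_output.cu" was initially added here as well,
    # but the review below asks for its removal because the op is deprecated.
]

setup(
    name="fastdeploy_ops",  # illustrative extension name
    ext_modules=CUDAExtension(sources=iluvatar_sources),
)
```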
Force-pushed from a58db43 to f2c3ff5.
LGTM
custom_ops/setup_ops.py (Outdated)
"gpu_ops/sample_kernels/top_k_renorm_probs.cu", | ||
"gpu_ops/text_image_index_out.cu", | ||
"gpu_ops/text_image_gather_scatter.cu", | ||
"gpu_ops/extract_text_token_output.cu", |
extract_text_token_output has already been deprecated. Please remove it as part of this PR as well, including the operator implementation, unit tests, and so on.
@yuanlehome Do you mean deleting custom_ops/gpu_ops/extract_text_token_output.cu, test/operators/test_extract_text_token_output.py, and the extract_text_token_output registration in cpp_extentions.cc? Beyond those three places, "gpu_ops/extract_text_token_output.cu" in setup_ops.py is also used by the metax_gpu build, so deleting the .cu implementation would affect that build as well, wouldn't it?
@yuanlehome It has been removed; please check whether the removal is correct.
002cbd8
```
@@ -1,101 +0,0 @@
// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
```
Has it been confirmed that this custom op is no longer used anywhere?
Yes, it has been deprecated.
Adapt the VL model on Iluvatar hardware.
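The documentation files added in this PR cover installation and usage on Iluvatar GPUs. For orientation only, a rough offline-inference sketch is shown below; it assumes FastDeploy's `LLM`/`SamplingParams` offline API and a placeholder ERNIE-4.5-VL checkpoint path. The multimodal (image plus text) request format and any Iluvatar-specific launch flags should be taken from docs/get_started/installation/iluvatar_gpu.md rather than from this sketch.

```python
# Rough text-only smoke test on an Iluvatar machine (assumptions: FastDeploy's
# offline LLM/SamplingParams API and a placeholder local checkpoint path; see
# docs/get_started/installation/iluvatar_gpu.md for the authoritative example,
# including how to pass image inputs to the VL model).
from fastdeploy import LLM, SamplingParams

sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Hypothetical checkpoint path; replace with the model referenced in the docs.
llm = LLM(model="/path/to/ERNIE-4.5-VL-28B-A3B-Paddle", tensor_parallel_size=1)

outputs = llm.generate(["Describe what a vision-language model does."], sampling_params)
for output in outputs:
    # Print the raw result object; see the docs for accessing the generated text.
    print(output)
```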